Multi-Document Summarization By Sentence Extraction

Authors

Jade Goldstein-Stewart

Vibhu O. Mittal

Jaime G. Carbonell

Mark Kantrowitz

Abstract

This paper discusses a text extraction approach to multi-document summarization that builds on single-document summarization methods by using additional available information about the document set as a whole and the relationships between the documents. Multi-document summarization differs from single-document summarization in that the issues of compression, speed, redundancy, and passage selection are critical in the formation of useful summaries. Our approach addresses these issues by using domain-independent techniques based mainly on fast, statistical processing, a metric for reducing redundancy and maximizing diversity in the selected passages, and a modular framework to allow easy parameterization for different genres, corpus characteristics, and user requirements.
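The abstract does not spell out the redundancy metric, so the following is only a minimal sketch of one way such a criterion could work: a greedy selection that scores each candidate passage by its relevance to a query minus its similarity to passages already chosen. The bag-of-words cosine similarity, the function names (`bag_of_words`, `cosine`, `select_passages`), and the trade-off weight `lam` are all assumptions made for illustration, not details taken from the paper.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; a simple stand-in for real term weighting."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity of two word-count vectors (0.0 if either is empty)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_passages(passages, query, k=2, lam=0.7):
    """Greedily pick up to k passages, trading relevance against redundancy.

    lam is a hypothetical trade-off weight: 1.0 ranks by relevance alone,
    lower values penalise overlap with already-selected passages more.
    """
    vectors = [bag_of_words(p) for p in passages]
    query_vec = bag_of_words(query)
    selected = []                      # indices of chosen passages, in pick order
    candidates = list(range(len(passages)))

    def score(i):
        relevance = cosine(vectors[i], query_vec)
        # Redundancy is the highest similarity to any passage picked so far.
        redundancy = max(
            (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
        )
        return lam * relevance - (1 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return [passages[i] for i in selected]


if __name__ == "__main__":
    docs = [
        "Earthquake damages coastal city bridges",
        "Coastal city bridges damaged by earthquake",
        "Rescue teams reach earthquake survivors",
        "Stock market closes higher",
    ]
    for passage in select_passages(docs, "earthquake coastal city rescue", lam=0.5):
        print(passage)
```

With `lam=1.0` the sketch ranks purely by relevance and returns the two near-duplicate bridge sentences; lowering `lam` to 0.5 replaces the second of them with the rescue sentence, which is the kind of diversity the abstract describes.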

Similar resources

Results of CRL/NYU System at DUC-2003 and an Experiment on Division of Document Sets

We participated in three multi-document summarization tasks at the DUC-2003 formal run and evaluated the performance of our summarization system. Our summarization system based on sentence extraction also incorporated a module to estimate similarity between sentences for multi-document summarization. The similarity information was used for selecting the representative sentence among similar sen...

Centroid-based summarization of multiple documents: sentence extraction utility-based evaluation, and user studies

We present a multi-document summarizer, called MEAD, which generates summaries using cluster centroids produced by a topic detection and tracking system. We also describe two new techniques, based on sentence utility and subsumption, which we have applied to the evaluation of both single and multiple document summaries. Finally, we describe two user studies that test our models of multi-documen...

NTT/NAIST's Text Summarization Systems for TSC-2

In this paper, we describe the following two approaches to summarization: (1) only sentence extraction, (2) sentence extraction + bunsetsu elimination. For both approaches, we use the machine learning algorithm called Support Vector Machines. We participated in both Task-A (single-document summarization task) and Task-B (multi-document summarization task) of TSC-2.

Sentence Similarity based on Dependency Tree Kernels for Multi-document Summarization

We introduce an approach based on using the dependency grammar representations of sentences to compute sentence similarity for extractive multi-document summarization. We adapt and investigate the effects of two untyped dependency tree kernels, which have originally been proposed for relation extraction, to the multi-document summarization problem. In addition, we propose a series of novel depe...

A Graph-based Approach to Cross-language Multi-document Summarization

Cross-language summarization is the task of generating a summary in a language different from the language of the source documents. In this paper, we propose a graph-based approach to multi-document summarization that integrates machine translation quality scores in the sentence extraction process. We evaluate our method on a manually translated subset of the DUC 2004 evaluation campaign. Resul...

Publication date: 2000
